Automatic Enrichment of Very Large Dictionary of Word Combinations on the Basis of Dependency Formalism

نویسندگان

  • Alexander F. Gelbukh
  • Grigori Sidorov
  • Sang-Yong Han
  • Erika Hernández-Rubio
چکیده

The paper presents a method of automatic enrichment of a very large dictionary of word combinations. The method is based on results of automatic syntactic analysis (parsing) of sentences. The dependency formalism is used for representation of syntactic trees that allows for easier treatment of information about syntactic compatibility. Evaluation of the method is presented for the Spanish language based on comparison of the automatically generated results with manually marked word combinations.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Automatic Syntactic Analysis for Detection of Word Combinations

The paper presents a method for automatic detection of “non-trivial” word combinations in the text. It is based on automatic syntactic analysis. The method shows better precision and recall than the baseline method (bigrams). It was tested on a text in Spanish. The method can be used for enrichment of very large dictionaries of word combinations.

متن کامل

Automatic Construction of Persian ICT WordNet using Princeton WordNet

WordNet is a large lexical database of English language, in which, nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...

متن کامل

تصحیح خودکار غلط های تایپی فارسی به کمک شبکه عصبی مصنوعی ترکیبی

Automatic correction of typos in the typed texts is one of the goals of research in artificial intelligence, data mining and natural language processing. Most of the existing methods are based on searching in dictionaries and determining the similarity of the dictionary entries and the given word. This paper presents the design, implementation, and evaluation of a Farsi typo correction system u...

متن کامل

Análisis Sintáctico para el Español Basado en el Formalismo de la Teoría Significado-Texto

The application of the Meaning ⇔ Text Theory to Spanish parsing is presented. This formalism is based on dependency grammars. The combinatorial dictionary of this method is employed for the syntactic analysis; it consists of patterns for words, mainly verbs, where all its valences and the way they are realized are described. In this method, no fixed word order in the sentence is considered so i...

متن کامل

Acquiring Syntactic Information for a Government Pattern Dictionary from Large Text Corpora

There are some research lines in automatic subcategorization frame acquisition and the importance of their work could not be doubted. However, almost all automatic work has been done in the constituent approach. Conversely, manual work is the traditional way for syntactic information acquisition in the dependency approach, which considers the correspondence between semantic valences and theirs ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2004